
164 research outputs found

Agents and stream data mining: a new perspective

Author: Lim Ee-Peng
Ng Wee-Keong
Ong Kok-Leong
Zhang Zili
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/05/2005

Many organizations struggle with the massive amount of data they collect. Today, data do more than serve as the ingredients for churning out statistical reports. They help support efficient operations in many organizations and, to some extent, provide the competitive intelligence organizations need to survive in today's economy. Data mining can't always deliver timely and relevant results because data are constantly changing. However, stream-data processing might be more effective, judging by the Matrix project.

Deakin Research Online

Reducing Cognitive Overheads in a Web Warehouse using Reverse-Osmosis

Author: Bhowmick Sourav S.
Lim Ee-Peng
Madria Sanjay Kumar
Ng Wee Keong
Publication venue: Scholars' Mine
Publication date: 01/01/2000

This paper provides a quantitative analysis of reducing cognitive overheads in a Web warehouse using an important class of operation called reverse osmosis. The analysis examines two cognitive overheads: locating relevant nodes or information, and the display time of a Web table. A reverse-osmosis operation enables us to eliminate irrelevant information from a collection of Web documents stored in the form of a Web table. We call such an operation reverse-osmosis because it is analogous to the reverse osmosis process in the field of water purification. We discuss a formal algorithm of the reverse-osmosis operatio

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Cost-benefit Analysis of Web Bag in a Web Warehouse

Author: Bhowmick Sourav S.
Lim Ee-Peng
Madria Sanjay Kumar
Ng Wee Keong
Publication venue: Scholars' Mine
Publication date: 01/01/1999

Sets and bags are closely related structures and have been studied in relational databases. A bag is different from a set in that it is sensitive to the number of times an element occurs, while a set is not. In this paper, we introduce the concept of a Web bag in the context of a World Wide Web warehouse called WHOWEDA (WareHouse Of WEb DAta) which we are currently building. Informally, a Web bag is a Web table which allows multiple occurrences of identical Web types. A Web bag helps one to discover useful knowledge from a Web table, such as visible documents or Web sites (i.e. documents/sites which can be reached by many paths), luminous documents (i.e. documents with many outgoing links) and luminous paths (i.e. frequently traversed paths). In this paper, we provide a cost-benefit analysis of materializing Web bags as compared to Web tables with distinct Web tuple
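The set/bag distinction the abstract draws can be illustrated with a minimal Python sketch. It is not WHOWEDA code: the sample data, the reduction of a Web tuple to the URL it ends at, and the helper `most_visible` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical Web tuples, each reduced to the URL it ends at (illustrative data only).
tuples = ["a.html", "b.html", "a.html", "c.html", "a.html"]

as_set = set(tuples)      # a set ignores how often an element occurs
as_bag = Counter(tuples)  # a bag (multiset) keeps the occurrence count


def most_visible(bag):
    """Return the element(s) with the highest occurrence count (hypothetical helper)."""
    top = max(bag.values())
    return sorted(doc for doc, n in bag.items() if n == top)


print(len(as_set))           # number of distinct documents
print(as_bag["a.html"])      # how many tuples reach a.html
print(most_visible(as_bag))  # candidate "visible" documents under this assumption
```

Keeping the duplicates is what makes the count available: once the tuples are collapsed into a set, the information about how many paths reach a document is lost.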

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Cluster-based database selection techniques for routing bibliographic queries

Author: LIM Ee Peng
NG Wee-Keong
XU Jian
Publication venue: Springer Science and Business Media LLC
Publication date: 01/08/1999

Crossref

Institutional Knowledge at Singapore Management University

Scheduling queries to improve the freshness of a website

Author: LIM Ee Peng
LIU Haifeng
NG Wee-Keong
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2005

In recent years the WWW has become a new advertising medium that corporations use to increase their exposure to consumers. For a very large website whose content is derived from some source database, it is important to maintain its freshness in response to changes to the base data. This issue is particularly significant for websites presenting fast-changing information such as stock exchange information and product information. In this paper, we formally define and study the freshness of a website that is refreshed by scheduling a set of queries to fetch fresh data from the databases. Then, we propose several online scheduling algorithms and compare the performance of the algorithms on the freshness metric. Our conclusion is verified by empirical results.

Keywords: Internet Data Management, View Maintenance, Query Optimization, Hard Real-Time Scheduling

1 Introduction

The popularity of the World-Wide Web (WWW) has made it a prime vehicle for disseminating information. More and ..

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Product Schema Integration for Electronic Commerce: A synonym comparison approach

Author: LIM Ee Peng
NG Wee-Keong
YAN Guanghao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/05/2002

Institutional Knowledge at Singapore Management University

ViDE: A Visual Data Extraction Environment for the Web

Author: LI Yi
LIM Ee Peng
NG Wee-Keong
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2001

Institutional Knowledge at Singapore Management University

Hierarchical text classification methods and their specification

Author: LIM Ee Peng
NG Wee-Keong
SUN Aixin
Publication venue: Springer Science and Business Media LLC
Publication date: 01/03/2003

Institutional Knowledge at Singapore Management University